An investigation of coreference phenomena in the biomedical domain
نویسندگان
چکیده
We describe challenges and advantages unique to coreference resolution in the biomedical domain, and a sieve-based architecture that leverages domain knowledge for both entity and event coreference resolution. Domain-general coreference resolution algorithms perform poorly on biomedical documents, because the cues they rely on such as gender are largely absent in this domain, and because they do not encode domain-specific knowledge such as the number and type of participants required in chemical reactions. Moreover, it is difficult to directly encode this knowledge into most coreference resolution algorithms because they are not rule-based. Our rule-based architecture uses sequentially applied hand-designed “sieves”, with the output of each sieve informing and constraining subsequent sieves. This architecture provides a 3.2% increase in throughput to our Reach event extraction system with precision parallel to that of the stricter system that relies solely on syntactic patterns for extraction.
منابع مشابه
Analysis of Coreference Relations in the Biomedical Literature
In this study, we perform an investigation of coreference resolution in the biomedical literature. We compare a state-ofthe-art general system with a purposebuilt system, demonstrating that the use of domain-specific knowledge results in dramatic improvement. However, performance of the system is still modest, with recall a particular problem (80% precision and 24% recall). Through analysis of ...
متن کاملDomain Adaptation with Active Learning for Coreference Resolution
In the literature, most prior work on coreference resolution centered on the newswire domain. Although a coreference resolution system trained on the newswire domain performs well on newswire texts, there is a huge performance drop when it is applied to the biomedical domain. In this paper, we present an approach integrating domain adaptation with active learning to adapt coreference resolution...
متن کاملCorpus based coreference resolution for Farsi text
"Coreference resolution" or "finding all expressions that refer to the same entity" in a text, is one of the important requirements in natural language processing. Two words are coreference when both refer to a single entity in the text or the real world. So the main task of coreference resolution systems is to identify terms that refer to a unique entity. A coreference resolution tool could be...
متن کاملDomain Adaptation of Coreference Resolution for Radiology Reports
In this paper we explore the applicability of existing coreference resolution systems to a biomedical genre: radiology reports. Analysis revealed that, due to the idiosyncrasies of the domain, both the formulation of the problem of coreference resolution and its solution need significant domain adaptation work. We reformulated the task and developed an unsupervised algorithm based on heuristics...
متن کاملThe Taming of Reconcile as a Biomedical Coreference Resolver
To participate in the Protein Coreference section of the BioNLP 2011 Shared Task, we use Reconcile, a coreference resolution engine, by replacing some pre-processing components and adding a new mention detector. We got some improvement from training two separate classifiers for detecting anaphora and antecedent mentions. Our system yielded the highest score in the task, F-score 34.05% in partia...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید
ثبت ناماگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید
ورودعنوان ژورنال:
- CoRR
دوره abs/1603.03758 شماره
صفحات -
تاریخ انتشار 2016